9 research outputs found

Genetic Sequence Matching Using D4M Big Data Approaches

Author: Dodson Stephanie
Kepner Jeremy
Ricke Darrell O.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/07/2014

Recent technological advances in Next Generation Sequencing tools have led to increasing speeds of DNA sample collection, preparation, and sequencing. One instrument can produce over 600 Gb of genetic sequence data in a single run. This creates new opportunities to efficiently handle the increasing workload. We propose a new method of fast genetic sequence analysis using the Dynamic Distributed Dimensional Data Model (D4M), an associative array environment for MATLAB developed at MIT Lincoln Laboratory. Based on mathematical and statistical properties, the method leverages big data techniques and the implementation of an Apache Accumulo database to accelerate computations one-hundred fold over other methods. Comparisons of the D4M method with the current gold standard for sequence analysis, BLAST, show the two are comparable in the alignments they find. This paper will present an overview of the D4M genetic sequence algorithm and statistical comparisons with BLAST. Comment: 6 pages; to appear in IEEE High Performance Extreme Computing (HPEC) 201

arXiv.org e-Print Archive

Crossref
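
The abstract above describes sequence matching built on associative arrays. The sketch below is only a rough illustration of that general idea and is not the authors' D4M implementation: it assumes sequences are decomposed into fixed-length words (k-mers), and the names `kmers`, `build_index`, and `shared_kmer_counts` are hypothetical. Plain Python dictionaries stand in for the associative arrays, and counting shared words per pair plays the role of a sparse matrix product.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(sequences, k):
    """Map each k-mer to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for word in kmers(seq, k):
            index[word].add(seq_id)
    return index


def shared_kmer_counts(samples, references, k):
    """Count distinct shared k-mers for every (sample, reference) pair.

    This equals the sparse product S * R^T, where S and R are 0/1 matrices
    with one row per sequence and one column per k-mer (an assumed encoding).
    """
    ref_index = build_index(references, k)
    counts = defaultdict(int)
    for sample_id, seq in samples.items():
        for word in kmers(seq, k):
            for ref_id in ref_index.get(word, ()):
                counts[(sample_id, ref_id)] += 1
    return dict(counts)


# Toy data, invented for illustration only.
references = {"ref1": "ACGTACGTGA", "ref2": "TTTTGGGGCC"}
samples = {"s1": "CGTACGTG"}
result = shared_kmer_counts(samples, references, k=4)
print(result)
```

Pairs with a high shared-word count would be candidates for a closer alignment check; the word length and any cutoff are free parameters of this sketch, not values taken from the paper.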

Rapid Sequence Identification of Potential Pathogens Using Techniques from Sparse Linear Algebra

Author: Chiu Nelson
Dodson Stephanie
Kepner Jeremy
Ricke Darrell O.
Shcherbina Anna
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/01/2015

The decreasing costs and increasing speed and accuracy of DNA sample collection, preparation, and sequencing have rapidly produced an enormous volume of genetic data. However, fast and accurate analysis of the samples remains a bottleneck. Here we present D4RAGenS, a genetic sequence identification algorithm that exhibits the Big Data handling and computational power of the Dynamic Distributed Dimensional Data Model (D4M). The method leverages linear algebra and statistical properties to increase computational performance while retaining accuracy by subsampling the data. Two run modes, Fast and Wise, yield speed and precision tradeoffs, with applications in biodefense and medical diagnostics. The D4RAGenS analysis algorithm is tested over several datasets, including three utilized for the Defense Threat Reduction Agency (DTRA) metagenomic algorithm contest

arXiv.org e-Print Archive

Crossref

A Linear Algebra Approach to Fast DNA Mixture Analysis Using GPUs

Author: Helfer Brian
Kepner Jeremy
Reuther Albert
Ricke Darrell O.
Samsi Siddharth
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/07/2017

Analysis of DNA samples is an important step in forensics, and the speed of analysis can impact investigations. Comparison of DNA sequences is based on the analysis of short tandem repeats (STRs), which are short DNA sequences of 2-5 base pairs. Current forensics approaches use 20 STR loci for analysis. The use of single nucleotide polymorphisms (SNPs) has utility for analysis of complex DNA mixtures. The use of tens of thousands of SNP loci for analysis poses significant computational challenges because the forensic analysis scales by the product of the loci count and number of DNA samples to be analyzed. In this paper, we discuss the implementation of a DNA sequence comparison algorithm by re-casting the algorithm in terms of linear algebra primitives. By developing an overloaded matrix multiplication approach to DNA comparisons, we can leverage advances in GPU hardware and algorithms for Dense Generalized Matrix-Multiply (DGEMM) to speed up DNA sample comparisons. We show that it is possible to compare 2048 unknown DNA samples with 20 million known samples in under 6 seconds using an NVIDIA K80 GPU. Comment: Accepted for publication at the 2017 IEEE High Performance Extreme Computing conferenc

arXiv.org e-Print Archive

Crossref
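
The abstract above recasts DNA sample comparison as matrix multiplication. As a minimal sketch of that idea, and not the authors' algorithm, the example below assumes each sample is encoded as a 0/1 vector over SNP loci (1 meaning the minor allele is present) and scores a pair by the number of loci where the known sample carries the minor allele but the unknown does not. The encoding, the scoring rule, and the names `matmul`, `transpose`, and `mismatch_counts` are all assumptions made for illustration; a real implementation would hand the product to a GPU GEMM routine rather than use pure Python.

```python
def matmul(a, b):
    """Plain dense matrix product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def mismatch_counts(unknowns, knowns):
    """Score every (unknown, known) pair with a single matrix product.

    Entry [i][j] counts loci where known sample j has the minor allele (1)
    and unknown sample i lacks it (0): (1 - U) * K^T.
    """
    absent = [[1 - v for v in row] for row in unknowns]
    return matmul(absent, transpose(knowns))


# Toy profiles over four loci, invented for illustration only.
unknowns = [[1, 0, 1, 1],
            [0, 0, 1, 0]]
knowns = [[1, 0, 1, 0],
          [0, 1, 0, 0],
          [1, 0, 1, 1]]
scores = mismatch_counts(unknowns, knowns)
print(scores)
```

Under this assumed scoring, a score of 0 means every minor allele of the known sample also appears in the unknown sample, which is the pattern expected when the known individual contributed to a mixture.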

Construction of an ~700-kb transcript map around the Familial Mediterranean Fever locus on human chromosome 16p13.3

Author: Adams
Aksentijevich
Andrea Cercek
Anil Vedula
Antequera
Buckland
Calabro
Chen
Dahl
Daniel L. Kastner
Darrell O. Ricke
David F. Callen
David Krizman
Deborah Gumucio
Elizabeth Mansfield
Francis S. Collins
Geryl Wood
Huebner
Ivona Aksentijevich
Jingmei Liu
Kulp
Lancet
Melanie Hamon
Michael Centola
Nathan Fischel-Ghodsian
Neil Richards
Neta Shafran
Norman A. Doggett
Nurit Zaks
P. Paul Liu
Pras
Puder
Raman Sood
Robert I. Richards
Robert K. Moyzis
Sinoula Apostolou
Tanaz Kahan
Trevor Blake
Xiang Chen
Xiaoguang Chen
Yasuda
Yokoyama
Zuoming Deng
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/1998

We used a combination of cDNA selection, exon amplification, and computational prediction from genomic sequence to isolate transcribed sequences from genomic DNA surrounding the familial Mediterranean fever (FMF) locus. Eighty-seven kb of genomic DNA around D16S3370, a marker showing a high degree of linkage disequilibrium with FMF, was sequenced to completion, and the sequence annotated. A transcript map reflecting the minimal number of genes encoded within the ∼700 kb of genomic DNA surrounding the FMF locus was assembled. This map consists of 27 genes with discrete messages detectable on Northerns, in addition to three olfactory-receptor genes, a cluster of 18 tRNA genes, and two putative transcriptional units that have typical intron–exon splice junctions yet do not detect messages on Northerns. Four of the transcripts are identical to genes described previously, seven have been independently identified by the French FMF Consortium, and the others are novel. Six related zinc-finger genes, a cluster of tRNAs, and three olfactory receptors account for the majority of transcribed sequences isolated from a 315-kb FMF central region (between D16S468/D16S3070 and cosmid 377A12). Interspersed among them are several genes that may be important in inflammation. This transcript map not only has permitted the identification of the FMF gene (MEFV), but also has provided us an opportunity to probe the structural and functional features of this region of chromosome 16. Michael Centola, Xiaoguang Chen, Raman Sood, Zuoming Deng, Ivona Aksentijevich, Trevor Blake, Darrell O. Ricke, Xiang Chen, Geryl Wood, Nurit Zaks, Neil Richards, David Krizman, Elizabeth Mansfield, Sinoula Apostolou, Jingmei Liu, Neta Shafran, Anil Vedula, Melanie Hamon, Andrea Cercek, Tanaz Kahan, Deborah Gumucio, David F. Callen, Robert I. Richards, Robert K. Moyzis, Norman A. Doggett, Francis S. Collins, P. Paul Liu, Nathan Fischel-Ghodsian and Daniel L. Kastne

Crossref

Adelaide Research & Scholarship

Two Different Antibody-Dependent Enhancement (ADE) Risks for SARS-CoV-2 Antibodies

Author: Ricke Darrell O.
Publication venue: 'Frontiers Media SA'
Publication date: 01/12/2020

COVID-19 (SARS-CoV-2) disease severity and stages vary across asymptomatic, mild flu-like, moderate, severe, critical, and chronic disease. COVID-19 disease progression includes lymphopenia, elevated proinflammatory cytokines and chemokines, accumulation of macrophages and neutrophils in lungs, immune dysregulation, cytokine storms, acute respiratory distress syndrome (ARDS), etc. Development of vaccines against severe acute respiratory syndrome (SARS), Middle East Respiratory Syndrome coronavirus (MERS-CoV), and other coronaviruses has been difficult due to vaccine-induced enhanced disease responses in animal models. Multiple betacoronaviruses including SARS-CoV-2 and SARS-CoV-1 expand cellular tropism by infecting some phagocytic cells (immature macrophages and dendritic cells) via antibody-bound Fc receptor uptake of virus. Antibody-dependent enhancement (ADE) may be involved in the clinical observation of increased severity of symptoms associated with early high levels of SARS-CoV-2 antibodies in patients. Infants with multisystem inflammatory syndrome in children (MIS-C) associated with COVID-19 may also have ADE caused by maternally acquired SARS-CoV-2 antibodies bound to mast cells. ADE risks associated with SARS-CoV-2 have implications for COVID-19 and MIS-C treatments, B-cell vaccines, SARS-CoV-2 antibody therapy, and convalescent plasma therapy for patients. SARS-CoV-2 antibodies bound to mast cells may be involved in MIS-C and multisystem inflammatory syndrome in adults (MIS-A) following initial COVID-19 infection. SARS-CoV-2 antibodies bound to Fc receptors on macrophages and mast cells may represent two different mechanisms for ADE in patients. These two different ADE risks have possible implications for SARS-CoV-2 B-cell vaccines for subsets of populations based on age, cross-reactive antibodies, variabilities in antibody levels over time, and pregnancy. These models place increased emphasis on the importance of developing safe SARS-CoV-2 T cell vaccines that are not dependent upon antibodies

DSpace@MIT

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

In vivo Monitoring of Transcriptional Dynamics After Lower-Limb Muscle Injury Enables Quantitative Classification of Healing

Author: Aguilar Carlos A.
Carrigan Christopher T.
Gifford Casey A.
Kottke Melissa A.
Meissner Alexander
Pop Ramona
Ricke Darrell O.
Shcherbina Anna
Urso Maria L.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2015

Traumatic lower-limb musculoskeletal injuries are pervasive amongst athletes and the military and typically an individual returns to activity prior to fully healing, increasing a predisposition for additional injuries and chronic pain. Monitoring healing progression after a musculoskeletal injury typically involves different types of imaging but these approaches suffer from several disadvantages. Isolating and profiling transcripts from the injured site would abrogate these shortcomings and provide enumerative insights into the regenerative potential of an individual’s muscle after injury. In this study, a traumatic injury was administered to a mouse model and healing progression was examined from 3 hours to 1 month using high-throughput RNA-Sequencing (RNA-Seq). Comprehensive dissection of the genome-wide datasets revealed the injured site to be a dynamic, heterogeneous environment composed of multiple cell types and thousands of genes undergoing significant expression changes in highly regulated networks. Four independent approaches were used to determine the set of genes, isoforms, and genetic pathways most characteristic of different time points post-injury and two novel approaches were developed to classify injured tissues at different time points. These results highlight the possibility to quantitatively track healing progression in situ via transcript profiling using high-throughput sequencing

DSpace@MIT

Harvard University - DASH

PubMed Central

Early onset of industrial-era warming across the oceans and continents

Author: A Bakun
A Moreno
A Schimmelmann
AP Schurer
AW Tudhope
BK Linsley
BL Otto-Bliesner
C Alibert
C MacFarling Meure
C Mora
C Saenger
CD Charles
D Gutiérrez
Darrell S. Kaufman
DC Frank
DE Black
DJ Seidel
DW Oppo
E Crespin
E Hawkins
E Ruggieri
EJ Hendy
EJ Steig
ERM Druffel
F Abrantes
F Ji
FE Urban
G Leduc
GA Schmidt
GC Hegerl
GJM Versteegh
H Doose-Rolinski
H Kuhnert
HC Wu
Helen V. McGregor
HV McGregor
I Harris
IL Hendy
IS Nurhati
J Hannig
J Marshall
J Zinke
JA Screen
JE Cole
JE Tierney
Jessica E. Tierney
JP Nicolas
K Lyu
K Pahnke
KC Armour
KE Taylor
KH Kilbourne
KL DeLong
KL Ricke
KM Cobb
L Landrum
LF Vásquez-Bedoya
M Boiseau
M Collins
M Mudelsee
M Pfeiffer
M Sigl
M Zhao
MA Goni
MA Sicre
ME Mann
MH England
Michael N. Evans
MK Gorman
N Diffenbaugh
N Nakamura
N Narayan
NA Rayner
Nerilie J. Abram
NF Goodkin
Nicholas P. McKay
NJ Abram
NP McKay
O Bothe
PA Stott
PK Swart
PW Staten
R Asami
R Neukom
RB Dunbar
S Bagnato
S Bonnet
S Brönnimann
S Hetzinger
S Levitus
S Rahmstorf
S Watanabe
S-P Xie
SJ Phipps
SL Lewis
T Felis
TD Damassa
TJ Bracegirdle
TJ Crowley
TM Quinn
TO Richter
TP Guilderson
TR Karl
V Nieto-Moreno
WF Ruddiman
X Chen
Y Hu
Z Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The evolution of industrial-era warming across the continents and oceans provides a context for future climate change and is important for determining climate sensitivity and the processes that control regional warming. Here we use post-AD 1500 palaeoclimate records to show that sustained industrial-era warming of the tropical oceans first developed during the mid-nineteenth century and was nearly synchronous with Northern Hemisphere continental warming. The early onset of sustained, significant warming in palaeoclimate records and model simulations suggests that greenhouse forcing of industrial-era warming commenced as early as the mid-nineteenth century and included an enhanced equatorial ocean response mechanism. The development of Southern Hemisphere warming is delayed in reconstructions, but this apparent delay is not reproduced in climate simulations. Our findings imply that instrumental records are too short to comprehensively assess anthropogenic climate change and that, in some regions, about 180 years of industrial-era warming has already caused surface temperatures to emerge above pre-industrial values, even when taking natural variability into account

HAL AMU

HAL Descartes

The Australian National University

HAL-CEA

espace@Curtin

Lund University Publications

Bern Open Repository and Information System (BORIS)

DIAL UCLouvain

HAL UVSQ

HAL-Polytechnique

Leicester Research Archive